ABSTRACT

Statistics of the weak lensing of galaxies can be used to constrain cosmology if the 
galaxy shear can be estimated accurately. In general this requires accurate modelling 
of unlensed galaxy shapes and the point spread function (PSF). I discuss suboptimal 
but potentially robust methods for estimating galaxy shear by stacking images such 
that the stacked image distribution is closely Gaussian by the central limit theorem. 
The shear can then be determined by radial fitting, requiring only an accurate model 
of the PSF rather than also needing to model each galaxy accurately. When noise 
is significant asymmetric errors in the centroid must be corrected, but the method 
may ultimately be able to give accurate un-biased results when there is a high galaxy 
density with constant shear. It provides a useful baseline for more optimal methods, 
and a test-case for estimating biases, though the method is not directly applicable to 
realistic data. I test stacking methods on the simple toy simulations with constant PSF 
and shear provided by the GREAT08 project, on which most other existing methods 
perform significantly more poorly, and briefly discuss generalizations to more realistic 
cases. In the appendix I discuss a simple analytic galaxy population model where 
stacking gives optimal errors in a perfect ideal case. 



1 INTRODUCTION 



Gravitational lensing of light from distant galaxies causes 
the shape of the galaxies to be distorted in a way that 
depends on the transverse gradients of the gravitational
potential along the line of sight (Bartelmann & Schneider
2001). If the distortion can be measured accurately, it gives
a constraint on the lensing potentials, and hence, with a large
enough number of samples, on the geometry and distribution
of perturbations in the universe. Since the galaxy shapes
vary greatly, this can only be done by analysing a very large
number of galaxies, with galaxies that are sufficiently well
separated that their intrinsic shape correlations can be
modelled out or are small. The galaxies can then be assumed to
be independent, so that any shape correlation is due entirely 
to lensing. The task is to find a way to estimate the lens- 
ing distortion, which can then be used to extract statistical 
results from an ensemble of galaxy images. 

At leading order the main observable distortion is that 
of galaxy shear. As discussed further below, if we could 
observe the galaxies directly, fitting any sheared profile to 
each galaxy will give an unbiased estimator of this shear. 
The problem is however much more complicated, because 
in practice we can only measure the shape after convolution 
with the point spread function (PSF) of the instrument (e.g. 
due to atmospheric fluctuations and instrumental imperfec- 
tions), and image pixelization. The levels of shear that are 
expected — a few percent — are comparable to those of typi- 
cal PSFs, so the PSF must be modelled very accurately in or- 
der to isolate the cosmological signal. Since the PSF breaks 
the symmetries of the problem, in general this requires
accurate modelling of both the unlensed galaxy shapes and
the PSF. Finding methods of doing this that work to the
required level of precision is an active area of current research.
At the moment it is unclear whether it is even possible to get
useful high-precision shear constraints in the presence of re- 
alistic ground-observation PSFs, or whether in fact there 
are unavoidable degeneracies with galaxy shapes and PSF 
modelling uncertainties. The correct statistical error on the 
shear measurement could also be too large for the number 
of observable galaxies to produce precision constraints. 

In this paper I revisit an old sub-optimal method
of Kuijken (1999): stacking galaxy images. If the intrinsic 
galaxy shapes are uncorrelated, a stacked unlensed image 
should have circular symmetry. Since convolution is a lin- 
ear operation, the observed stacked image should then be 
a PSF-convolved sheared version of a circularly symmetric 
average galaxy. If the PSF is known, the only modelling un- 
certainties are then in the averaged galaxy profile, which 
should be well determined by the data. Furthermore, un- 
der fairly general conditions a sum of independent samples 
should have a close-to-Gaussian distribution by the central 
limit theorem, so the statistics of the stacked image are known
without needing to know anything about the distribution of
individual galaxy shapes. Fitting a radial profile and shear
to the data with a Gaussian error model gives an estimate of
the shear that should be largely independent of the actual
distribution of galaxy shapes. The method therefore provides a
useful baseline for comparing future more optimal methods 
that incorporate accurate modelling of individual galaxies. 

In practice of course things are not so simple. To start
with, the shear and PSF are not expected to be constant, so
any stacked galaxy image has to be interpreted with care. In 
addition, in the presence of noise, the process of stacking can 
itself produce biases since we cannot determine the centroid 
accurately: any shear- or PSF-correlated misalignments in 
the stacking procedure will introduce biases. 

Given the complexity of the general problem, the lensing
community has helpfully boiled the issues down into a series
of much simpler problems (Heymans et al. 2006; Massey
et al. 2007; Bridle et al. 2008). Although existing methods
perform adequately for current and near-future data, even
in these highly simplified cases they are known to be inade- 
quate for future surveys. I therefore focus on these simplified 
problems to try to isolate the important issues, in particular 
I shall assume the PSF is well measured from many low-noise 
star images and that shear is constant. If no method works
accurately even on this very simple toy problem, then that 
is clearly sufficient to show that ground-based weak lens- 
ing surveys with similar PSFs will be of no use for precision 
cosmology (i.e. future cosmological parameter constraints at 
the percent level or better). On the other hand if sufficiently 
accurate methods can be developed, the next task will be to 
make them applicable to more realistic situations where the 
PSF is likely to vary significantly and the shear has a real- 
istic spatial correlation function. Space-based observations 
typically have rather different PSFs and would require a 
separate study. 

I start by reviewing the case of shape estimation when 
there is no PSF, and then briefly explain why introducing 
a PSF qualitatively increases the complexity of the prob- 
lem. I then move on to show that stacking can work well 
with low-noise simulations, and discuss various issues to do 
with pixel-scale stacking, centroid errors and non-constant 
PSFs. I test stacking methods on the GREAT08 (see footnote 1)
simulations (Bridle et al. 2008) and show that they perform well
compared to other existing methods, most of which involve
modelling unlensed galaxy shape distributions with some- 
thing that is known to be incorrect. Unlike other existing 
methods the stacking method is not directly applicable to 
more realistic data, but may be useful to motivate more 
general approaches. 



2 FITTING GALAXIES AND SHEAR 
2.1 Shear fitting with no PSF 

At lowest order in the gravitational potentials weak lensing
causes position $x_u$ on the unlensed image to be related to
the corresponding position $x_l$ on the lensed image by

$$ x_u = S\,x_l, \tag{1} $$

where the components $g_1$ and $g_2$ of the shear matrix $S$ are the
reduced shear in some coordinate system. For the purposes
of this paper we can neglect the uniform convergence, which
is degenerate with the galaxy size, and assume $g_1$, $g_2$ are
constant across each galaxy image. For a thorough introduction
to weak lensing see Bartelmann & Schneider (2001) and Lewis &
Challinor (2006).

Consider fitting a model $m(S_m x, \theta)$ to an unlensed perfect
galaxy image $I_u(x)$, with model parameters $\theta$ and shear
matrix $S_m$. For example a simple least-squares fit would
solve

$$ \frac{\partial}{\partial\theta} \int d^2x\, |I_u(x) - m(S_m x, \theta)|^2 = 0, \qquad
\frac{\partial}{\partial S_m} \int d^2x\, |I_u(x) - m(S_m x, \theta)|^2 = 0. \tag{2} $$



Assuming there is a unique solution $S_m = S_0$, $\theta = \hat\theta$, the
lensed image $I_l(x) = I_u(Sx)$ would then be fit by $m(\hat S x, \hat\theta)$
where $\hat S = S_0 S$. The best-fit unlensed shear matrix $S_0$ is
determined by the particular galaxy and model. The key
assumption in galaxy weak lensing is that the galaxy shapes
are statistically isotropic, in other words that versions of
each unlensed galaxy rotated by different angles (or flipped)
are equally likely. So $R^T S_0 R$ is just as likely as $S_0$ for
some rotation matrix $R$. Taking it to be a rotation by
90°, on average over many galaxy orientations we have
$\langle S_0 \rangle = \frac{1}{2} \langle S_0 + R^T S_0 R \rangle = I$, and hence
$\langle \hat S \rangle = S$: the shear
matrix estimator is unbiased. Note that this is entirely
independent of how well $m(S_m x, \theta)$ actually fits the unlensed
galaxy, so in the idealized case we could fit any model we
like to galaxy shapes and still on average get the correct
answer. This will remain true for best-fits to more general
log-likelihoods of the form

$$ -\ln\mathcal{L} \propto \int d^2x\, [I_l(x) - m(S_m x, \theta)]^{\dagger}\, [N(I_l(x), S_m x)]^{-1}\, [I_l(x) - m(S_m x, \theta)], \tag{3} $$

i.e. where the noise depends only on the lensed galaxy inten- 
sity or follows the alignment of the galaxy model. Similarly 
for generalizations with correlated noise. 
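
The cancellation argument of this subsection can be checked numerically. The following is only an illustrative sketch: it assumes the common parametrization S = [[1 - g1, -g2], [-g2, 1 + g1]] for both the lensing shear and the best-fit intrinsic matrix, and all function names are hypothetical helpers rather than anything defined in the paper.

```python
def shear_matrix(g1, g2):
    # Assumed convention (an illustration, not taken from the text):
    # S = [[1 - g1, -g2], [-g2, 1 + g1]]
    return [[1.0 - g1, -g2], [-g2, 1.0 + g1]]


def matmul(a, b):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]


def mean2(a, b):
    # Element-wise average of two 2x2 matrices.
    return [[0.5 * (a[i][j] + b[i][j]) for j in range(2)] for i in range(2)]


R90 = [[0.0, -1.0], [1.0, 0.0]]  # rotation by 90 degrees


def rotate90(s):
    # R^T S R: the same galaxy rotated by 90 degrees.
    return matmul(transpose(R90), matmul(s, R90))


S0 = shear_matrix(0.30, -0.20)  # best-fit matrix of one hypothetical unlensed galaxy
S = shear_matrix(0.02, 0.01)    # lensing shear applied to every galaxy

# Averaging a galaxy with its rotated copy removes the intrinsic shape ...
mean_intrinsic = mean2(S0, rotate90(S0))
# ... so the average of the fitted matrices S0 S recovers the lensing shear.
mean_estimate = mean2(matmul(S0, S), matmul(rotate90(S0), S))
```

Under these assumptions `mean_intrinsic` is the identity and `mean_estimate` equals `S`, regardless of the values chosen for the intrinsic shape.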



2.2 Shear fitting with a PSF 

Unfortunately we cannot observe lensed galaxies directly,
but only after convolution with an instrumental point spread
function and pixelization. Pixelization can be thought of as
an additional contribution to the PSF, typically a convolu- 
tion with a square window function, followed by sampling 
at the pixel centres. I shall discuss the PSF in this general- 
ized sense, so that the observational data consists of a set 
of regularly-spaced samples of a PSF and pixel-convolved 
galaxy image. The noise-free observed value at position x 
on the image plane is then 



$$ I_o(x) = \int d^2y\, P(x - y)\, I_l(y), \tag{4} $$

where $P(x)$ is the total PSF, or simply $I_o = P \star I_l$. If we know
the PSF function, we can fit a PSF-convolved galaxy model
to the data; for example a least-squares solution would
minimize

$$ \chi^2 = \int d^2x \left( I_o(x) - \int d^2y\, P(x - y)\, m(S_m y, \theta) \right)^2
= \int d^2x \left( \int d^2y\, P(x - y)\, \{ I_u(Sy) - m(S_m y, \theta) \} \right)^2. \tag{5} $$



1 http://www.great08challenge.info/



If $S_m = S_0$, $\theta = \hat\theta$ is the best fit when there is no lensing,
then because of the position dependence of the PSF it is no longer
in general the case that $S_m = S_0 S$, $\theta = \hat\theta$ is the best fit
to the lensed image. Hence unlike in the case with no PSF,
there is no longer any guarantee that fitting gives an
unbiased estimate of the shear. The only general exception
is if the model fits the galaxy exactly, so the best fit has
$I_u(Sy) = m(S_m y, \theta)$, in which case fitting gives the
right answer on average independent of the known PSF.
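
A rough feel for why the PSF matters so much comes from the Gaussian case, where convolution simply adds the second-moment (covariance) matrices of galaxy and PSF. The sketch below is an illustration under that Gaussian assumption only, again using the assumed convention S = [[1 - g1, -g2], [-g2, 1 + g1]]; the function names and the numerical values are hypothetical.

```python
def lensed_covariance(sigma, g1, g2):
    """Covariance of I_l(x) = I_u(S x) for a circular Gaussian I_u of width sigma.

    With the assumed symmetric S, the lensed covariance is sigma^2 (S^T S)^{-1}.
    """
    det = (1.0 - g1 * g1 - g2 * g2) ** 2
    q11 = sigma ** 2 * ((1.0 + g1) ** 2 + g2 ** 2) / det
    q22 = sigma ** 2 * ((1.0 - g1) ** 2 + g2 ** 2) / det
    q12 = sigma ** 2 * 2.0 * g2 / det
    return [[q11, q12], [q12, q22]]


def add(a, b):
    # Convolving two Gaussians adds their covariance matrices.
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def ellipticity(q):
    trace = q[0][0] + q[1][1]
    return ((q[0][0] - q[1][1]) / trace, 2.0 * q[0][1] / trace)


galaxy = lensed_covariance(sigma=2.0, g1=0.03, g2=0.0)
round_psf = [[4.0, 0.0], [0.0, 4.0]]        # circular PSF, same size as the galaxy
elliptical_psf = [[4.4, 0.0], [0.0, 3.6]]   # PSF with 10 per cent ellipticity

e_true = ellipticity(galaxy)                          # about 2 g1 for small shear
e_round = ellipticity(add(galaxy, round_psf))         # diluted by the circular PSF
e_psf = ellipticity(add(galaxy, elliptical_psf))      # dominated by the PSF shape
```

In this toy case a circular PSF roughly halves the measured ellipticity, while a PSF ellipticity of a few percent produces a spurious signal larger than the shear itself, which is why the PSF model has to be accurate.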

Extracting unbiased shear constraints by model fitting 
in the presence of a PSF therefore in general requires mod- 
elling the galaxies accurately. This poses several significant 
problems. A large galaxy lensing survey will have most of 
its galaxies near the edge of its resolution, therefore there 
is typically only limited high-quality data to constrain the 
properties of the bulk of the galaxies in the selection func- 
tion. A general Bayesian model could use information from 
some well-resolved galaxies, and fit a general model for un- 
certainties, but given the large variation in galaxy alignment 
with respect to the line of sight, and wide variations in the 
intrinsic shapes, a general model is likely to involve a large 
number of parameters and require many images to constrain 
well. The galaxy model can also be constrained to some ex- 
tent using all of the observed galaxies. But if the param- 
eters can not all be well constrained by the data, it may 
be essential that the priors accurately represent the galaxy 
distributions in order to get unbiased answers. In addition 
any model with large numbers of parameters per galaxy is 
likely to become numerically time consuming. For an excellent
discussion of many related issues see Bernstein & Jarvis
(2002); Hirata & Seljak (2003), and a summary of other existing
methods in Bridle et al. (2008). For promising recent
results on Bayesian model fitting see Miller et al. (2007);
Kitching et al. (2008), however the galaxy model used in
this method is still unrealistic and results, though significantly
better than many other methods, are still not good
enough for high-precision cosmology (see Sec. 4).



3 SHEAR FITTING STACKED GALAXIES 

General galaxy fitting should provide the best constraints 
on the shear. However given the problems outlined above, 
and given potential difficulties in knowing whether the mod- 
elling is accurate enough, it would be useful to have a simple 
less-optimal but more robust shear-estimation method that 
is more directly independent of the details of the galaxy dis- 
tribution. In simple test cases this would be a useful cross- 
check, and provide a baseline for the levels of residual noise 
that better methods should be able to beat. 

The method I shall focus on simply stacks the galaxies, 
and then fits a sheared average galaxy model to the stacked
image, following Kuijken (1999). If the PSF is known, this
should give unbiased results conditional only on being able 
to stack in an unbiased way, and being able to model the 
radial profile of the averaged unlensed galaxy accurately. A 
1-dimensional radial model is clearly much easier to fit than
a full 2D galaxy shape distribution, and since the average 
galaxy is expected to have a smooth radial profile only a 
modest number of parameters should be required. These pa- 
rameters are likely to be well constrained with a reasonable 
number of galaxies (and hence the results fairly independent 
of the priors) . 

In order to stack galaxy images, we need to be able to
define a rule for the relative galaxy alignment, e.g. by defining
a centroid in each image and then stacking the images so
that their centroids are aligned. Assuming this can be done,
we then have an observed stack of $N$ galaxy images

$$ \bar I_o(x) \equiv \frac{1}{N} \sum_{i=1}^{N} \beta_i I_{o,i}(x) = \frac{1}{N} \sum_{i=1}^{N} \beta_i [P_i \star I_{l,i}](x), \tag{6} $$

where $\beta_i$ are some weights and $P_i$ the PSF on galaxy $i$.
Assuming the PSF is independent of the galaxy shape the
expected value of the stacked image is

$$ \langle \bar I_o \rangle = \left( \frac{1}{N} \sum_{i=1}^{N} P_i \right) \star \bar I_{l,\beta} = \bar P \star \bar I_{l,\beta}, \tag{7} $$

where $\bar I_{l,\beta}$ is the average of a weighted galaxy. By symmetry,
taking $x$ to have origin at the centroid, in the unlensed
case $\bar I_{u,\beta}(x) = \bar I_{u,\beta}(|x|)$. The average PSF $\bar P$, including
pixelization, is precisely what is observed from a large
statistically equivalent set of star images (assuming stars are
point sources and have the same PSF as the galaxies).

Assuming the weights are independent of the shear and 
the shear is constant, the expectation of the stacked im- 
age is a sheared circularly-symmetric averaged galaxy, con- 
volved with an average PSF. We can therefore proceed to 
fit a model to the observed stacked image, and if the radial 
profile can be fit accurately the method should be unbiased. 
In the appendix I discuss a simple analytic galaxy popula- 
tion model in which, for the ideal noise-free case, stacking 
with an appropriate weighting is in fact optimal. 
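
The statement that a stack of isotropically oriented galaxies is circular, and that a constant shear then shows up directly in the stack, can be illustrated with a small noise-free toy simulation. This sketch makes several simplifying assumptions that are not part of the method described here: there is no PSF, galaxies are elliptical Gaussians with known centroids, equal weights are used, each galaxy is paired with a copy rotated by 90 degrees to cancel shape noise, and the shear is read off from unweighted quadrupole moments instead of a radial-profile fit. The shear-matrix convention and all names are assumptions.

```python
import math
import random


def galaxy_image(sig_a, sig_b, phi, g1, g2, half=12):
    """Sample I_l(x) = I_u(S x) on a square grid for an elliptical Gaussian I_u."""
    c, s = math.cos(phi), math.sin(phi)
    # Inverse covariance of the unlensed galaxy, rotated by angle phi.
    a = c * c / sig_a ** 2 + s * s / sig_b ** 2
    b = c * s * (1.0 / sig_a ** 2 - 1.0 / sig_b ** 2)
    d = s * s / sig_a ** 2 + c * c / sig_b ** 2
    image = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Assumed convention: S = [[1 - g1, -g2], [-g2, 1 + g1]]
            ux = (1.0 - g1) * x - g2 * y
            uy = -g2 * x + (1.0 + g1) * y
            row.append(math.exp(-0.5 * (a * ux * ux + 2.0 * b * ux * uy + d * uy * uy)))
        image.append(row)
    return image


def stack(images):
    # Equal-weight average of centroid-aligned images.
    n, size = len(images), len(images[0])
    return [[sum(img[j][i] for img in images) / n for i in range(size)]
            for j in range(size)]


def ellipticity(image):
    # Unweighted quadrupole moments about the grid centre.
    half = len(image) // 2
    mxx = myy = mxy = 0.0
    for j, row in enumerate(image):
        for i, value in enumerate(row):
            x, y = i - half, j - half
            mxx += value * x * x
            myy += value * y * y
            mxy += value * x * y
    return ((mxx - myy) / (mxx + myy), 2.0 * mxy / (mxx + myy))


def simulate_stack(g1, g2, n_pairs=20, seed=1):
    rng = random.Random(seed)
    images = []
    for _ in range(n_pairs):
        sig_a = rng.uniform(1.5, 2.5)
        sig_b = sig_a * rng.uniform(0.5, 1.0)
        phi = rng.uniform(0.0, math.pi)
        images.append(galaxy_image(sig_a, sig_b, phi, g1, g2))
        images.append(galaxy_image(sig_a, sig_b, phi + 0.5 * math.pi, g1, g2))
    return stack(images)


e_unlensed = ellipticity(simulate_stack(0.0, 0.0))
e_lensed = ellipticity(simulate_stack(0.05, 0.0))
```

With these assumptions the unlensed stack has zero ellipticity, and the lensed stack has e1 close to 2 g1 / (1 + g1^2), even though every individual galaxy is strongly elliptical.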

In the presence of noise, and with finite N so that there 
is dispersion about the expectation value, we need an error 
model. One benefit of using stacked images is that this is 
well defined: assuming each galaxy is independent, the
distribution of $\bar I_o$, a sum of many independent galaxy samples,
should be nearly Gaussian by the central limit theorem.
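
As a toy illustration of the central limit theorem at work, the sketch below compares the excess kurtosis (zero for a Gaussian) of single draws from a strongly non-Gaussian distribution with that of averages over a stack of draws. The exponential distribution, the stack size and the sample counts are arbitrary choices made for this illustration and have nothing to do with real pixel statistics.

```python
import random


def excess_kurtosis(values):
    # Fourth standardized moment minus 3; zero for a Gaussian distribution.
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


rng = random.Random(0)
n_stack = 25      # number of independent samples averaged in each "stack"
n_trials = 4000   # number of single draws and of stacks used to estimate kurtosis

single = [rng.expovariate(1.0) for _ in range(n_trials)]
stacked = [sum(rng.expovariate(1.0) for _ in range(n_stack)) / n_stack
           for _ in range(n_trials)]

k_single = excess_kurtosis(single)    # the exponential distribution has excess kurtosis 6
k_stacked = excess_kurtosis(stacked)  # expected to shrink roughly as 6 / n_stack
```

The non-Gaussianity of the stacked value falls in proportion to the number of samples in the stack, which is what makes a Gaussian error model reasonable for a stack of thousands of galaxies.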

In fact we can apply any linear function to $\bar I_o$, and the
distribution will still be Gaussian with expectation given 
by the equivalent linear function of the averaged convolved 
galaxy. This can be useful for data compression, e.g. to re- 
pixelize, or expand in moments, etc, anything that's likely to 
encapsulate most of the useful information in fewer numbers. 
If the stacked image is generated at much higher pixel sam- 
pling than the original image, there will be a large number of 
pixel values and hence a huge number of galaxies required 
for the covariance estimate to be accurate. Also since the 
noise is correlated on the scale of the original pixel size, the 
covariance would be singular, so applying some linear re- 
pixelization or other linear compression matrix M can be 
useful. Writing the set of sampled $x$ values as a vector, for
a data vector $X = M \bar I_o$ the covariance $C_X$ of $X$ about its
expectation can be estimated from $N$ galaxy samples as

$$ C_X = \frac{1}{N^2} \sum_{i=1}^{N} M (\beta_i I_{o,i} - \bar I_o)(\beta_i I_{o,i} - \bar I_o)^T M^T \tag{8} $$

(for large $N$, $N \gg \dim(X)$). The likelihood as a function of
parameters $\theta$ and shear matrix $S$ can then be approximated
as

$$ -2\ln\mathcal{L}(S, \theta) \approx [\bar I_o - m_o(S, \theta)]^T M^T C_X^{-1} M [\bar I_o - m_o(S, \theta)], \tag{9} $$

where $m_o(S, \theta) = \bar P \star m(S, \theta)$ is the model for the
average PSF-convolved sheared circularly-symmetric galaxy.
The simplest thing is to take $M$ to just re-pixelize, e.g. at
the original pixel resolution. For simple PSFs it also loses
little information to halve the number of points by taking $M$
to sum $\bar I_o(x)$ and $\bar I_o(-x)$, since this includes the shear and
radial information but ignores irrelevant dipole fluctuations.
Note that $C_X$ includes variance due to both noise and 'shape
noise' due to the differences in galaxy shapes. The latter is 
expected to be spatially correlated even if the noise is not, 
but in any case the central limit theorem result straightfor- 
wardly accounts for any correlated or non-Gaussian noise in 
individual images. 
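
The covariance estimate and Gaussian likelihood described above can be sketched in a few lines. This is a minimal illustration, not the code used for the results: it assumes the covariance of the stacked mean is normalized as the sample scatter divided by N^2 (appropriate for large N), restricts the compressed data vector to two components so that the matrix inverse can be written out by hand, and uses hypothetical function names and made-up pixel values.

```python
def apply_matrix(m, v):
    # Linear compression X = M v, with M stored as a list of rows.
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


def covariance_of_mean(m, weighted_images):
    """Mean of X_i = M (beta_i I_i) and the covariance of that mean.

    Uses (1/N^2) sum_i (X_i - Xbar)(X_i - Xbar)^T, valid for large N.
    """
    n = len(weighted_images)
    xs = [apply_matrix(m, img) for img in weighted_images]
    dim = len(xs[0])
    mean = [sum(x[a] for x in xs) / n for a in range(dim)]
    cov = [[sum((x[a] - mean[a]) * (x[b] - mean[b]) for x in xs) / n ** 2
            for b in range(dim)] for a in range(dim)]
    return mean, cov


def chi_squared(mean, cov, model):
    """-2 ln L = r^T C^{-1} r for a two-component compressed data vector."""
    r0, r1 = mean[0] - model[0], mean[1] - model[1]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    return (cov[1][1] * r0 * r0 - 2.0 * cov[0][1] * r0 * r1
            + cov[0][0] * r1 * r1) / det


# Toy example: four "high-resolution" pixels re-pixelized into two coarse pixels.
M = [[1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0]]
images = [[2.0, 2.0, 1.0, 1.0],
          [0.0, 0.0, 1.0, 1.0],
          [1.0, 1.0, 2.0, 2.0],
          [1.0, 1.0, 0.0, 0.0]]

mean, cov = covariance_of_mean(M, images)
chi2 = chi_squared(mean, cov, model=[1.0, 2.0])  # model is already compressed by M
```

In a real application the model vector would be the compressed PSF-convolved sheared profile, and the chi-squared would be minimized over the shear and profile parameters.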

In the limit that the model fits the stacked image ex- 
actly, and the stated assumptions are met, the fitting proce- 
dure should be unbiased. However due to noise this will not 
quite be the case, so there is potentially a source of noise- 
bias through the PSF and shear dependence of the estimated 
covariance. 

Note that even if the instrumental PSF is actually con- 
stant, the PSF for points in the stacked image plane varies 
from galaxy to galaxy due to the offset between the centres 
of the pixels in each image and the centre of the stacked 
image. If the stacked image is pixelized at higher resolution, 
then there are different PSFs for each non-equivalent high- 
resolution pixel centre. However the averaged PSF will be 
the same for each high-resolution pixel. The high resolution 
pixels are of course strongly correlated due to the pixelization
of the individual images.

Figure 1. Typical residuals after fitting a unit-amplitude sheared
circularly-symmetric galaxy to a stacked image. Note errors are
correlated due to pixel-scale stacking correlations and correlated
shape noise. Here pixels have been added at 15 sub-pixel resolution,
less than needed for accurate results.



3.1 Centroid issues 

Centroid errors on the galaxy image plane are harmless 
(other than increasing the error bars), since they merely 
affect the average galaxy profile. For example we could consider
defining the centroid in terms of a random displacement
from the centre of light in the unlensed galaxy, which
is perfectly legitimate. The problem is that in general the 
centroid error will depend on the galaxy shape, and hence 
also shear and PSF in a non-trivial way. Since the centroid
of a long thin shape is hard to determine in the long direc- 
tion, the centroid error is typically strongly correlated with 
the shape of the galaxy; if the galaxies have a net ellipticity 
in one direction due to the PSF, the centroids will tend to 
have a net dispersion aligned with the PSF. This has the 
effect of making a naively stacked image give results biased 
in the direction of the PSF. There are similar effects due to 
shear. When the centroid error is not negligible compared 
to the galaxy sizes, the centroid error must be accounted 
for somehow in order to get unbiased results. In general this 
is difficult, though an approximate correction may be suffi- 
cient. 

Two simple approaches immediately present themselves.
We could simply attempt to model the effective centroid- 
error PSF and include it as part of the effective PSF on the 
stacked image. Or the centroid error could be modified to 
remove some of the sources of bias. The latter approach is 
likely to be more straightforward, if less optimal. 

As a crude first attempt to remove the leading-order 
centroid bias I simply add Gaussian noise to each centroid 
so that the total centroid dispersion is approximately cir- 
cularly symmetric. To do this I fit a 6-parameter Gaussian 
elliptical model to each observed (PSF-convolved) galaxy 
to determine the centroid, calculating the Hessian errors by
numerical differentiation and then inverting to get an
approximate centroid error matrix. Then to each estimated
centroid I add a small Gaussian displacement in a direction 
chosen such that the total centroid error is then isotropic. 
If the estimate of the centroid error on each galaxy is fairly 
accurate, this should remove the correlation of centroid dis- 
persion with galaxy alignment, and hence reduce the PSF 
bias. However the magnitude of the centroid error will still 
depend on the PSF-convolved sheared galaxy shape, and 
hence potentially lead to residual biases. Furthermore if the 
total centroid dispersion is accounted for by allowing the av- 
erage galaxy profile to change, the centroid error really has 
to be sheared like the rest of the galaxy shape; a better ap- 
proach could therefore use an approximate estimate of the 
shear to ensure that the total centroid error on the image 
plane is sheared approximately correctly (see footnote 2).
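
One way to realize the isotropizing step described above is to add one-dimensional Gaussian noise along the best-determined (minor-axis) direction of the estimated centroid covariance, with variance equal to the difference of its two eigenvalues. The sketch below is only one possible reading of that step; the function names are hypothetical and the covariance values are made up.

```python
import math
import random


def isotropizing_noise(cov):
    """Return (lam_max, minor_axis, variance) for a 2x2 centroid covariance.

    Adding noise of the returned variance along minor_axis makes the total
    centroid covariance equal to lam_max times the identity.
    """
    a, b, d = cov[0][0], cov[0][1], cov[1][1]
    radius = math.hypot(0.5 * (a - d), b)
    lam_max = 0.5 * (a + d) + radius
    theta = 0.5 * math.atan2(2.0 * b, a - d)     # position angle of the major axis
    minor_axis = (-math.sin(theta), math.cos(theta))
    return lam_max, minor_axis, 2.0 * radius     # 2 * radius = lam_max - lam_min


def perturb_centroid(centroid, cov, rng):
    # Displace the estimated centroid along the minor axis of its error ellipse.
    _, minor_axis, variance = isotropizing_noise(cov)
    step = rng.gauss(0.0, math.sqrt(variance))
    return (centroid[0] + step * minor_axis[0], centroid[1] + step * minor_axis[1])


cov = [[4.0, 1.0], [1.0, 2.0]]   # made-up centroid covariance from a Hessian
lam_max, minor_axis, variance = isotropizing_noise(cov)

# Covariance of the original error plus the added noise: lam_max * identity.
total = [[cov[i][j] + variance * minor_axis[i] * minor_axis[j] for j in range(2)]
         for i in range(2)]

new_centroid = perturb_centroid((10.3, 7.8), cov, random.Random(0))
```

The cost of this approach is that every centroid ends up with the dispersion of its worst-determined direction, which broadens the stacked profile and increases the error bars.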

The centroid determined by Gaussian model fitting that 
I use seems to have about 10% less dispersion than that
obtained using adaptive moments (Bernstein & Jarvis 2002),
however in the presence of a more complicated PSF (e.g. with
a dipole) the position of the centroid could be biased, so a 
more sophisticated method may be required. Unfortunately 
to get the centroid error correct in general requires mod- 
elling the shape of each galaxy correctly, which is just as 
hard as the original shape estimation problem. However as 
long as the centroid error is small compared to the size of 
the galaxies, an approximate correction may be sufficient. 
Indeed the output from a more realistic galaxy fitting code
like Lensfit (Kitching et al. 2008) might be a good place to
start trying to improve the crude Gaussian model used here.
Simulations may also be reliable enough to find a fudge pa- 
rameter to relate the estimated centroid error to the true 
centroid error to the required accuracy.



4 RESULTS WITH SIMULATIONS

I test the galaxy stacking method on simulations provided
by the GREAT08 project (Bridle et al. 2008). These satisfy
the required assumptions, in that the shear is constant
over a large number of galaxy images. The simulations also
have constant (very simple) PSF, with a large number of 
low-noise star images so that the PSF can be determined 
essentially exactly. The PSF is anisotropic but has no dipole 
moment and the isophotes have the same shape at each ra- 
dius; it is therefore a rather special case and is likely to be 
unrealistic in several important respects. Nonetheless, even 
with these radical simplifications from reality, most exist- 
ing shear-estimation methods fail to produce results at an 
accuracy required for precision cosmology, so it makes an 
interesting test case.

The disadvantage of non-optimal methods such as
stacking is that there are lots of free parameters, e.g. choice of $\beta_i$
and radial fitting function, choice of the $M$ reduction matrix,
as well as resolution parameters governing the stacking.
The weights $\beta_i$ must be chosen in a shape-independent (or at
least alignment-independent) manner, otherwise biases may
be introduced. I take $\beta_i$ to be constant or inversely proportional
to the integrated signal in each image (so that the average
galaxy is then independent of the magnitude distribution of
the galaxies); see the Appendix for a discussion of the optimal
weighting in an idealized case. For noisy images this
should probably be modified by an estimate of the signal to
noise ratio to down-weight noise-dominated images. I chose
$M$ simply to re-pixelize the stacked image to the resolution
of the original galaxies.

Parameterizing the radial distribution of $m(S, \theta)$ using
splines is convenient, so $\theta$ is a set of values at some radial
spline nodes. Splines naturally have multiple resolution: e.g.
we can do a quick fit with a few spline points, then increase
the number of spline parameters to refine the result. This
could be done in an adaptive way to make sure the data
is fit but not over-fit. I simply choose to spline in the log
of the radial amplitude, using 12 spline points over a radius
range of 8 pixel units for fitting the stacked image,
with stacked-image resolution 1/31 or 1/41 of the original
pixel size (with an initial fit at 1/9 resolution with 7 spline
points to get close to the best-fit point quickly). The fitting
could be done with MCMC to get accurate error bars,
though a quick look does not show evidence of strong
degeneracies or asymmetries in the error bars. So finding just
the best fit is a reasonable first step, with errors approximated
from a Hessian if required. To find the best-fit point
I use the NEWUOA (see footnote 3) algorithm, which can be significantly
faster than the AMOEBA downhill-simplex method (Nelder &
Mead 1965) in many cases, though I need to be a bit careful
to avoid local minima. The resulting reduced chi-squared is
generally less than one, indicating that indeed the galaxy is
being fit accurately, though values are hard to assess due to
the non-realistic mirroring procedure used in the GREAT08
simulations to help reduce shape noise. Typical residuals are
shown in Fig. 1.
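
The radial parameterization can be sketched as follows. To keep the example dependency-free it swaps the spline for piecewise-linear interpolation in the log of the amplitude, which is a cruder stand-in and not the method used for the results; it also omits the PSF convolution, assumes the shear convention S = [[1 - g1, -g2], [-g2, 1 + g1]], and uses hypothetical names and toy node values.

```python
import bisect
import math


def radial_profile(r, node_radii, log_amplitudes):
    """Interpolate linearly in log amplitude between radial nodes.

    A stand-in for a spline; values are held constant outside the node range.
    """
    if r <= node_radii[0]:
        return math.exp(log_amplitudes[0])
    if r >= node_radii[-1]:
        return math.exp(log_amplitudes[-1])
    k = bisect.bisect_right(node_radii, r)
    t = (r - node_radii[k - 1]) / (node_radii[k] - node_radii[k - 1])
    return math.exp((1.0 - t) * log_amplitudes[k - 1] + t * log_amplitudes[k])


def sheared_model(x, y, g1, g2, node_radii, log_amplitudes):
    """Evaluate m(S x, theta) for a circularly symmetric profile (no PSF)."""
    ux = (1.0 - g1) * x - g2 * y
    uy = -g2 * x + (1.0 + g1) * y
    return radial_profile(math.hypot(ux, uy), node_radii, log_amplitudes)


# Twelve nodes over a radius range of 8 pixel units, as in the text.
node_radii = [8.0 * k / 11 for k in range(12)]
# Toy parameter values theta: a Gaussian-like falling profile.
log_amplitudes = [-0.5 * (r / 2.0) ** 2 for r in node_radii]

on_x_axis = sheared_model(3.0, 0.0, 0.05, 0.0, node_radii, log_amplitudes)
on_y_axis = sheared_model(0.0, 3.0, 0.05, 0.0, node_radii, log_amplitudes)
```

Working in the log of the amplitude keeps the model positive for any parameter values, and adding nodes refines the fit without changing the meaning of the existing parameters.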

For noisy images the centroid error needs to be corrected
as discussed in the previous section. Comparison with
the test simulation indicates that the centroid variance is
underestimated by about a half, so I adopt a
centroid-error fudge parameter $\alpha = 1.5$ (chosen to work
with the test simulations), and assume that the actual centroid
covariance on each galaxy is $\alpha C$ where $C$ is estimated
from the Hessian about the best-fit Gaussian model.

Accuracy of results for the purpose of GREAT08 is defined
by a quality parameter $Q$ (Bridle et al. 2008), so that
the shear variance is

$$ \left\langle (\langle \hat g_1 \rangle - g_1^t)^2 + (\langle \hat g_2 \rangle - g_2^t)^2 \right\rangle = \frac{10^{-4}}{Q}, \tag{10} $$

where $\hat g_i$ is the shear estimated from a plate of 10000 galaxy
images at constant shear and the same PSF, $g_i^t$ is the true
shear, and $\langle \hat g_i \rangle$ is estimated from an ensemble of different
simulated plates with the same shear. For noisy simulations
results are quoted for $Q$ estimated from the expectation
value from a set of simulations with different
PSF and true shears. The target for future observations is
$Q \sim O(1000)$ (Amara & Refregier 2008), and current methods
typically give $Q \lesssim O(100)$. Biases on $g_i$ therefore need
to be below the $\sim 3 \times 10^{-4}$ level, or a typical fractional shear
error of less than about a percent. For low-noise images the
definition is simply to take each plate separately

$$ Q = \frac{10^{-4}}{\left\langle (\hat g_1 - g_1^t)^2 + (\hat g_2 - g_2^t)^2 \right\rangle_{\rm plates}}, \tag{11} $$

with the stacking method described here giving $Q \sim 300$.
The errors have a contribution from any systematic error
and intrinsic shape noise (which may be significantly higher
than the minimum possible due to the lossy nature of the stacking
procedure). Most methods used with current data give Q < 30.
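
For concreteness, the low-noise form of the quality parameter (one estimate per plate) can be computed as in the sketch below. The function names and the shear values are made up for illustration, and the normalization simply follows the definition quoted above.

```python
def quality_factor(estimates, truths):
    """Q = 1e-4 / <(g1 - g1t)^2 + (g2 - g2t)^2>, averaged over plates."""
    total = 0.0
    for (g1, g2), (t1, t2) in zip(estimates, truths):
        total += (g1 - t1) ** 2 + (g2 - t2) ** 2
    return 1e-4 / (total / len(estimates))


def rms_error_for(q):
    # Root-mean-square shear error corresponding to a given Q.
    return (1e-4 / q) ** 0.5


# Two made-up plates, each with an error of 1e-3 in one shear component.
truths = [(0.02, -0.01), (-0.03, 0.015)]
estimates = [(0.021, -0.01), (-0.03, 0.014)]

q = quality_factor(estimates, truths)   # about 100 for errors of 1e-3
target_error = rms_error_for(1000.0)    # about 3.2e-4 for the Q ~ 1000 target
```

This makes the scaling explicit: improving Q by a factor of ten requires reducing the typical shear error by a factor of about three.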

When the noise is significant the method is no longer 
strictly valid due to centroid issues, however using the cen- 
troid error correction described above still gives Q ~ 130, 
which is at least as good as other existing methods at the
time of this work, and within a factor of two of the best 
method eventually winning the GREAT08 challenge. How- 
ever the stacking method is more reliant on the non-realistic 
constant-shear assumption than some other methods, so the 
main use may be as a baseline for simulation-based compar- 
isons with better codes. 

The fact that Q > 100 can be obtained by this sub- 
optimal method, making essentially no assumptions about 
the galaxy distribution, is perhaps encouraging evidence 
that there will exist a better method that is good enough for 
precision cosmology using only modest assumptions about 
the galaxy distribution. There is some evidence for shear- 
calibration bias in the stacking results, with a tendency for 
|g| to be too large. More careful modelling of the centroid
error, for example using better model fitting and an iterative 
shear estimate, could probably reduce the systematic error. 
Suitable time-consuming adjustment of the method param- 
eters may also allow the method to perform significantly 
better. 



2 Thanks to Gary Bernstein for pointing out this issue.

3 http://www.inrialpes.fr/bipop/people/guilbert/newuoa/newuoa.html





5 CONCLUSIONS 

I have revisited the simple shear-estimation stacking method
of Kuijken (1999), and shown that it still makes a useful
baseline that can compare favourably with currently used
methods in idealized cases. Although it only works straight- 
forwardly over regions with constant shear, it can be a useful 
test case, and help to understand possible sources of bias in 
other methods. Stacking has the advantage of giving results 
that are unbiased almost independent of the unknown distri- 
bution of unlensed galaxy shapes. Residual biases enter at a 
lower level, for example through correlations of the centroid 
error with galaxy shape. With low noise the method can 
produce accurate results, comparing favourably with meth- 
ods that fit galaxy models that are known to be unrealistic. 
This should be unsurprising: Bayesian methods generally 
give the right results only if the correct model is used and 
priors truly reflect beliefs. Only in the very special case of no 
observational PSF does fitting any model to galaxy shapes 
give unbiased answers; a general PSF breaks all the symme- 
tries, requiring accurate modelling of both the PSF and the 
unlensed galaxy shape distribution to get the right result. 

Even if individual noisy galaxies are well fit by a simple 
galaxy model due to the large noise, if in reality galaxies 
have significant substructure or un-modelled shape varia- 
tions, the combined high-precision shear estimate from fit- 
ting many galaxies separately may be biased due to the 
inconsistent shape modelling. It is possible the modelling 
bias is negligible, but unless carefully proven analytically or 
demonstrated numerically in realistic simulations it would
be safer to assume otherwise (see Voigt & Bridle (2009)
for a quantitative analysis of the significant bias in various
idealized cases). The noise-free stacking procedure is by
construction linear in the galaxies, which is why substruc- 
ture variations between galaxies effectively cancel. However 
fitting to individual galaxies is usually a non-linear proce- 
dure, and there is no reason to expect errors to cancel more 
generally. Future work may however be able to find fairly 
model-independent methods that can be applied to fitting 
individual galaxies, significantly improving on the stacking 
method both in terms of signal to noise, and in terms of 
application to more realistic cases. If not, stacking methods 
may still be useful. Future work could investigate how to 
apply stacking in more realistic cases where the shear varies 
from galaxy to galaxy, and the PSF can only be estimated 
locally with significant noise. At leading order, a fit to a 
stacked galaxy constructed over a region with small varia- 
tions in shear should be probing the appropriately averaged 
shear. With a high galaxy density the corresponding sup- 
pression of small-scale power may be acceptable if it can be 
accurately modelled without significant bias. 
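
The linearity argument above can be illustrated with a minimal sketch. It is not the radial fitting procedure of this paper: there is no PSF, no noise and no centroid error, and the stacked image is summarized only by its flux-weighted second moments. The galaxy model (`random_galaxy`, a few point-like blobs with random fluxes) and the shear parameterization $S = \mathrm{diag}(e^g, e^{-g})$ are assumptions made for the example. Because second moments are linear in the image, the stacked moments of an isotropically oriented population transform as $\sigma^2 S^2$ whatever the substructure, so the stacked ellipticity is $\tanh(2g)$.

```python
import math
import random

def random_galaxy(rng, n_blobs=5):
    """Hypothetical irregular galaxy: a few point-like blobs with random
    fluxes, stretched by a random axis ratio and rotated by a random angle."""
    q = rng.uniform(0.3, 1.0)            # intrinsic axis ratio
    phi = rng.uniform(0.0, math.pi)      # isotropic orientation
    c, s = math.cos(phi), math.sin(phi)
    blobs = []
    for _ in range(n_blobs):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, q)
        blobs.append((c*u - s*v, s*u + c*v, rng.uniform(0.5, 1.5)))
    return blobs                          # list of (x, y, flux)

def stacked_shear(g, n_gal=20000, seed=1):
    """Shear every galaxy by S = diag(e^g, e^-g), accumulate the
    flux-weighted second moments of the stack about the known centre,
    and invert e = tanh(2g)."""
    rng = random.Random(seed)
    a, b = math.exp(g), math.exp(-g)
    m11 = m22 = 0.0
    for _ in range(n_gal):
        for x, y, f in random_galaxy(rng):
            xs, ys = a*x, b*y             # sheared coordinates
            m11 += f*xs*xs
            m22 += f*ys*ys
    e = (m11 - m22) / (m11 + m22)
    return 0.5*math.atanh(e)

if __name__ == "__main__":
    print(stacked_shear(0.05))
```

The recovered value scatters around the input shear with the usual shape-noise error, without any model of the individual galaxies having been fitted.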
APPENDIX A: ANALYTIC EXAMPLE

Here we consider a very simple toy distribution of galaxies 
where we can attempt to calculate some things analytically. 



Consider the case where each galaxy has a Gaussian-shaped 
profile 



$$ I_w(\mathbf{x}|Q_i) = A\,\frac{e^{-\mathbf{x}^T Q_i^{-1}\mathbf{x}/2}}{|Q_i|^{w/2}}, \tag{A1} $$



where the distribution of the covariance $Q_i$ of each galaxy
is drawn from a 2-dimensional inverse Wishart distribution
(see e.g. Gupta & Nagar (1999) for review and results used
below)



$$ P(Q) = \frac{|\Psi|^{(n-3)/2}\, e^{-\frac{1}{2}\mathrm{Tr}(\Psi Q^{-1})}}{2^{n-3}\,\pi^{1/2}\,\Gamma[(n-3)/2]\,\Gamma[(n-4)/2]\,|Q|^{n/2}}, \tag{A2} $$

where $n > 6$. Since we assume the unlensed distribution is
statistically isotropic, $\Psi = (n-6)\sigma_g^2 I$, where $\sigma_g$ is the
average galaxy width. The parameter $n$ determines how broad
the galaxy shape distribution is, with $n \to \infty$ corresponding
to a distribution of identical circular Gaussian galaxies.
Typical galaxy ellipticities are $O(n^{-1/2})$ with

$$ \frac{\langle (Q_{11}-Q_{22})^2\rangle}{\langle (Q_{11}+Q_{22})^2\rangle} = \frac{\langle 4Q_{12}^2\rangle}{\langle (Q_{11}+Q_{22})^2\rangle} = \frac{1}{n-6}. \tag{A3} $$

The parameter $w$ governs how the magnitude varies, with
$w = 0$ corresponding to all galaxies having the same peak
amplitude, and $w = 1$ corresponding to them all having equal
integrated light.
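
The ellipticity ratio in Eq. (A3) can be checked with a short Monte Carlo sketch. It assumes integer $n$, so that $Q^{-1}$ can be drawn as a sum of $n-3$ outer products of Gaussian vectors (the Wishart construction); the function names are hypothetical. The second moments of an inverse Wishart in this parameterization are finite only for $n > 8$, so the example uses $n = 20$ rather than a value close to 6.

```python
import random

def sample_Q(n, sigma_g, rng):
    """One draw of (Q11, Q12, Q22) from Eq. (A2) with Psi = (n-6) sigma_g^2 I.

    Q^{-1} is Wishart with n-3 degrees of freedom and scale Psi^{-1},
    so Q^{-1} = sum_k z_k z_k^T with z_k ~ N(0, Psi^{-1}) (integer n assumed).
    """
    s = ((n - 6) * sigma_g**2) ** -0.5    # std of each component of z_k
    w11 = w12 = w22 = 0.0
    for _ in range(n - 3):
        z1, z2 = rng.gauss(0.0, s), rng.gauss(0.0, s)
        w11 += z1*z1
        w12 += z1*z2
        w22 += z2*z2
    det = w11*w22 - w12*w12
    return w22/det, -w12/det, w11/det     # invert the 2x2 matrix

def ellipticity_ratios(n, sigma_g=1.0, n_gal=20000, seed=0):
    """Sample estimates of the two ratios on the left of Eq. (A3)."""
    rng = random.Random(seed)
    num_diag = num_off = den = 0.0
    for _ in range(n_gal):
        q11, q12, q22 = sample_Q(n, sigma_g, rng)
        num_diag += (q11 - q22)**2
        num_off += 4.0*q12**2
        den += (q11 + q22)**2
    return num_diag/den, num_off/den

if __name__ == "__main__":
    n = 20
    print(ellipticity_ratios(n), 1.0/(n - 6))
```

Both sample ratios should agree with $1/(n-6)$ to within the Monte Carlo scatter.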

The averaged galaxy profile (with equal weight) is given by

$$ \bar I_w(\mathbf{x}) = \int dQ\, P(Q)\, I_w(\mathbf{x}|Q) = \frac{A\,\Gamma[n+w-4]}{\Gamma[n-4]\,|\Psi|^{w/2}\,(1+\mathbf{x}^T\Psi^{-1}\mathbf{x})^{(n+w-3)/2}}. \tag{A4} $$

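As expected this becomes the same as the individual galaxy
shape as $n \to \infty$. The covariance can be calculated similarly
as

$$ \begin{aligned} \mathrm{cov}(\mathbf{x},\mathbf{y}) &= \int dQ\, P(Q)\, I_w(\mathbf{x}|Q)\, I_w(\mathbf{y}|Q) - \bar I_w(\mathbf{x})\,\bar I_w(\mathbf{y}) \\ &= \frac{A^2}{\Gamma[n-4]\,|\Psi|^{w}} \left\{ \frac{\Gamma[n+2w-4]}{|I + \Psi^{-1}(\mathbf{x}\mathbf{x}^T + \mathbf{y}\mathbf{y}^T)|^{(n+2w-3)/2}} - \frac{\Gamma[n+w-4]^2}{\Gamma[n-4]\,[(1+\mathbf{x}^T\Psi^{-1}\mathbf{x})(1+\mathbf{y}^T\Psi^{-1}\mathbf{y})]^{(n+w-3)/2}} \right\}. \end{aligned} \tag{A5} $$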



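Note that

$$ |I + \Psi^{-1}(\mathbf{x}\mathbf{x}^T + \mathbf{y}\mathbf{y}^T)| = (1+\mathbf{x}^T\Psi^{-1}\mathbf{x})(1+\mathbf{y}^T\Psi^{-1}\mathbf{y}) - (\mathbf{x}^T\Psi^{-1}\mathbf{y})^2. \tag{A6} $$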

The covariance is determined by the number of degrees of 
freedom governing the population, so that with a simple 
model the number of significantly non-zero eigenvalues is 
small. In the case here each galaxy shape is determined by 
the three independent numbers in Q. 
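
As a consistency check, the mean profile of Eq. (A4) and the covariance of Eq. (A5), with the determinant written using Eq. (A6), can be compared against a direct Monte Carlo average over the population. The sketch below assumes integer $n$ and the isotropic case $\Psi = cI$ with $c = (n-6)\sigma_g^2$; the function names and the evaluation points are hypothetical choices for illustration.

```python
import math
import random

def sample_Qinv(n, c, rng):
    """W = Q^{-1} ~ Wishart(Psi^{-1}, n-3) for Psi = c*I, as (W11, W12, W22)."""
    s = c ** -0.5
    w11 = w12 = w22 = 0.0
    for _ in range(n - 3):
        z1, z2 = rng.gauss(0.0, s), rng.gauss(0.0, s)
        w11 += z1*z1
        w12 += z1*z2
        w22 += z2*z2
    return w11, w12, w22

def profile(x, W, w, A=1.0):
    """I_w(x|Q) of Eq. (A1), written in terms of W = Q^{-1} (|Q| = 1/|W|)."""
    w11, w12, w22 = W
    quad = w11*x[0]**2 + 2.0*w12*x[0]*x[1] + w22*x[1]**2
    return A * (w11*w22 - w12**2)**(w/2) * math.exp(-0.5*quad)

def mean_analytic(x, n, w, c, A=1.0):
    """Eq. (A4) for Psi = c*I, where |Psi|^(w/2) = c^w."""
    r2 = (x[0]**2 + x[1]**2) / c
    return A*math.gamma(n + w - 4) / (
        math.gamma(n - 4) * c**w * (1.0 + r2)**((n + w - 3)/2))

def cov_analytic(x, y, n, w, c, A=1.0):
    """Eq. (A5) for Psi = c*I, with the determinant taken from Eq. (A6)."""
    xx = (x[0]**2 + x[1]**2) / c
    yy = (y[0]**2 + y[1]**2) / c
    xy = (x[0]*y[0] + x[1]*y[1]) / c
    det = (1.0 + xx)*(1.0 + yy) - xy**2
    t1 = math.gamma(n + 2*w - 4) / det**((n + 2*w - 3)/2)
    t2 = math.gamma(n + w - 4)**2 / (
        math.gamma(n - 4) * ((1.0 + xx)*(1.0 + yy))**((n + w - 3)/2))
    return A**2 / (math.gamma(n - 4) * c**(2*w)) * (t1 - t2)

def monte_carlo(x, y, n, w, c, n_gal=20000, seed=0):
    """Sample mean at x and sample covariance between x and y."""
    rng = random.Random(seed)
    sx = sy = sxy = 0.0
    for _ in range(n_gal):
        W = sample_Qinv(n, c, rng)
        ix, iy = profile(x, W, w), profile(y, W, w)
        sx += ix
        sy += iy
        sxy += ix*iy
    mx, my = sx/n_gal, sy/n_gal
    return mx, sxy/n_gal - mx*my

if __name__ == "__main__":
    n, w = 20, 1
    c = float(n - 6)                      # sigma_g = 1
    x, y = (1.0, 0.5), (-0.5, 1.5)
    print(monte_carlo(x, y, n, w, c))
    print(mean_analytic(x, n, w, c), cov_analytic(x, y, n, w, c))
```

The covariance is a small difference of two larger terms, so it converges more slowly than the mean as the number of sampled galaxies increases.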

Using this analytic galaxy population model we can
compare the errors from stacking (e.g. on the shear matrix $S$
that takes the unlensed $\Psi$ to $S^T \Psi S$) with what could
be done using an optimal analysis. If $Q_i$ were simply measured
directly from each galaxy (in the low-noise limit) then
the optimal expected error is

$$ \left[ -\sum_i \left\langle \frac{\partial^2 \ln P(Q_i)}{\partial g_a\,\partial g_b} \right\rangle \right]^{-1}_{g=0} = \frac{\delta_{ab}}{4N(n-3)}, \tag{A7} $$
where $N$ is the number of galaxies. The corresponding error
per galaxy is $\sigma_1 = \sigma_2 = 1/(2\sqrt{n-3})$, where $\sigma_i$ is the error
on $g_i$. For $n = 7$ this corresponds to an error per galaxy
$\sigma_i = 0.25$.
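
The scaling in Eq. (A7) can be checked numerically under some explicit assumptions that are not taken from the text: integer $n$, a known amplitude $c$ with unlensed $\Psi = cI$, true shear $g = 0$, and a unit-determinant parameterization in which the lensed $\Psi$ is $c\,\exp[2(g_1\sigma_3 + g_2\sigma_1)]$, where $\sigma_3 = \mathrm{diag}(1,-1)$ and $\sigma_1$ is the off-diagonal Pauli matrix. With $M = \sum_i Q_i^{-1} = m_0 I + m_1\sigma_3 + m_2\sigma_1$, maximizing the inverse Wishart likelihood then gives $g_a = -(m_a/|m|)\,\mathrm{artanh}(|m|/m_0)/2$. The function names are hypothetical.

```python
import math
import random

def ml_shear(n, n_gal, rng, c=1.0):
    """Maximum-likelihood (g1, g2) from directly measured Q_i, for
    unsheared galaxies drawn from Eq. (A2) with Psi = c*I."""
    s = c ** -0.5
    m11 = m12 = m22 = 0.0                 # M = sum_i Q_i^{-1}
    for _ in range(n_gal):
        # Q_i^{-1} is a sum of n-3 outer products z z^T, z ~ N(0, Psi^{-1})
        for _ in range(n - 3):
            z1, z2 = rng.gauss(0.0, s), rng.gauss(0.0, s)
            m11 += z1*z1
            m12 += z1*z2
            m22 += z2*z2
    m0, m1, m2 = 0.5*(m11 + m22), 0.5*(m11 - m22), m12
    m = math.hypot(m1, m2)
    amp = 0.5*math.atanh(m/m0)/m if m > 0.0 else 0.0
    return -amp*m1, -amp*m2

def shear_scatter(n=7, n_gal=100, n_real=500, seed=0):
    """RMS of the estimated g1 over realizations, and the Eq. (A7) value."""
    rng = random.Random(seed)
    g1s = [ml_shear(n, n_gal, rng)[0] for _ in range(n_real)]
    rms = math.sqrt(sum(g*g for g in g1s) / n_real)
    return rms, 1.0 / (2.0*math.sqrt(n_gal*(n - 3)))

if __name__ == "__main__":
    print(shear_scatter())
```

The estimator depends on the galaxies only through $\sum_i Q_i^{-1}$, and the measured scatter should match $1/(2\sqrt{N(n-3)})$ up to the Monte Carlo error on the RMS.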

Using stacked galaxies with $w = 0$ in fact gives the
same average error per galaxy, with the errors increasing
only slightly for $w \sim O(1)$. In this noise-free case with known
distributions and centroids, the stacking method is close to
optimal. To show that with $w = 0$ stacking gives optimal
answers we only need to show that the full likelihood can be
written in terms of the stacked image. Since

$$ -2\ln P(\Psi|\{Q_i\}) = \sum_i \left[ \mathrm{Tr}(\Psi Q_i^{-1}) - (n-3)\ln|\Psi| \right] + \mathrm{const}, \tag{A8} $$

where the last term is independent of $\Psi$, a sufficient statistic
is $\sum_i Q_i^{-1}$. However this can be measured by taking derivatives
of the perfect stacked image

$$ \bar I_0(\mathbf{x}) = \frac{A}{N}\sum_i \left[ 1 - \frac{1}{2}\mathbf{x}^T Q_i^{-1}\mathbf{x} + \dots \right] \tag{A9} $$



at the origin, and hence stacking is lossless for measuring
the shear in this ideal case. Since the number of degrees of
freedom in the galaxy model is small, the stacked image does
not actually need to be densely sampled to obtain close to
optimal results.

In the zero-noise limit with infinite resolution, a known
PSF can simply be deconvolved, so the above results also
apply to PSF-smeared noise-free galaxies. Noise can be accounted
for by adding an appropriate term to Eq. (A5) if
the centroids are known, and will increase the expected error
per galaxy. Analysing more realistic cases analytically is
challenging.

Voigt L., Bridle S., 2009, Fundamental limits to lensing
shear estimation for model fitting methods, in preparation.